An Over-partitioning Scheme for Parallel Sorting on Clusters with Processors Running at Different Speeds
Abstract
In this work we introduce a new algorithm for in-core parallel sorting of integer keys, based on the overpartitioning scheme introduced by Li and Sevcik [1]. The algorithm targets clusters whose processors run at different speeds, that is, speeds related to one another by a constant multiplicative factor. We compare this approach experimentally with one derived from Parallel Sorting by Regular Sampling [2], which we have extended to handle clusters with processors of different speeds. The comparison uses the sublist expansion metric, which measures how well the algorithm balances the load. Our focus is the load balance factor, not (yet) the execution time, on the premise that improved load balance leads to improved execution time. The results show that load balancing on computers with heterogeneous processing capacity is more challenging than in the homogeneous case.
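To make the metric concrete, here is a minimal sketch of a sublist expansion computation. In the homogeneous case the metric is commonly taken as the ratio of the largest partition to the mean partition size. The speed-weighted form below, where each processor's ideal share is proportional to its relative speed, is an assumption made for illustration and is not necessarily the definition used by the authors. The function name `sublist_expansion` and its parameters are hypothetical.

```python
from typing import Sequence


def sublist_expansion(sizes: Sequence[int], speeds: Sequence[float]) -> float:
    """Return the worst ratio of actual to ideal partition size.

    Assumption (not from the paper): the ideal share of processor i is
    total_keys * speeds[i] / sum(speeds). With equal speeds this reduces
    to max partition size divided by mean partition size.
    A value of 1.0 means perfect balance; larger values mean worse balance.
    """
    if not sizes or len(sizes) != len(speeds):
        raise ValueError("sizes and speeds must be non-empty and equally long")
    if any(s <= 0 for s in speeds):
        raise ValueError("speeds must be positive")

    total_keys = sum(sizes)
    if total_keys == 0:
        raise ValueError("there must be at least one key")
    total_speed = sum(speeds)

    worst = 0.0
    for size, speed in zip(sizes, speeds):
        ideal = total_keys * speed / total_speed  # speed-proportional share
        worst = max(worst, size / ideal)
    return worst


if __name__ == "__main__":
    # Three processors, the third twice as fast as the other two.
    print(sublist_expansion([100, 100, 200], [1, 1, 2]))
    print(sublist_expansion([150, 100, 150], [1, 1, 2]))
```

In the second call the first processor holds 150 keys against an ideal share of 100, so it is the one that determines the reported expansion.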
Similar resources
Methods for Partitioning Data to Improve Parallel Execution Time for Sorting on Heterogeneous Clusters
The aim of the paper is to introduce general techniques for optimizing the parallel execution time of sorting on distributed architectures with processors of various speeds. Such an application requires a partitioning step. For uniformly related processors (processor speeds are related by a constant factor), we develop a constant time technique for mastering processor load and executio...
Towards Parallel Sorting with Sampling Techniques on non Homogeneous Clusters
In this note we introduce a parallel in-core technique for sorting integer keys, based on the regular sampling technique. We sketch an algorithm devoted to clusters of non-homogeneous processors (the speeds of processors and/or the speeds to access distributed disks in the clusters are correlated by a multiplicative constant factor and/or the bandwidth of the underlying netw...
Performance Evaluation of Static and Dynamic Load Balancing Schemes for a Parallel Computational Fluid Dynamics Software (CFD) Application (FLUENT) Distributed across Clusters of Heterogeneous Symmetric Multiprocessor Systems
Computational Fluid Dynamics (CFD) applications are “highly parallelizable” and can be distributed across a cluster of computers. However, because computation time can vary with the distributed part (mesh), the system loads are unpredictable and processors can have widely different computation speeds. Load balancing (and thus computational efficiency) across a heterogeneous cluster of processor...
A Message-Passing Distributed Memory Parallel Algorithm for a Dual-Code Thin Layer, Parabolized Navier-Stokes Solver
In this study, the results of parallelization of a 3-D dual code (Thin Layer, Parabolized Navier-Stokes solver) for solving supersonic turbulent flow around body and wing-body combinations are presented. As a serial code, TLNS solver is very time consuming and takes a large part of memory due to the iterative and lengthy computations. Also for complicated geometries, an exceeding number of grid...
Optimizing a CFD Fortran code for GRID Computing
Computations on clusters and computational GRIDS encounter similar situations where the processors used have different speeds and local RAM. In order to have efficient computations with processors of different speeds and local RAM, load balancing is necessary. That is, faster processors are given more work or larger domains to compute than the slower processors so that all processors finish the...